genes, which may be differential, were missed (hence the Type II error) due to the high extreme outliers occurring in the cancer replicates.

A high extreme outlier in gene 216987_at in GDS3139 caused a true DEG to be missed.
COPA
The cancer outlier profile analysis algorithm (COPA) was perhaps the first one for discovering heterogeneous DEGs [Tomlins, et al., 2005]. It modified the t statistic into a ratio: the distance between the rth percentile of the case expressions (the value used was the 9th) and the median of all expressions, over the median absolute distance (the deviation from the whole population median of the gene). The COPA t statistic is defined as below,
\[ t_{\mathrm{COPA}} = \frac{q(\mathbf{y}) - \lambda}{\sigma} \tag{6.10} \]
where $\sigma = 1.4826 \times \mathrm{median}\{|\mathbf{x} - \lambda|, |\mathbf{y} - \lambda|\}$, $\mathbf{x}$ stands for a vector of the control expressions, $\mathbf{y}$ stands for a vector of case expressions, $q(\mathbf{y})$ stands for the rth percentile of $\mathbf{y}$, and $\lambda$ is the median of both $\mathbf{x}$ and $\mathbf{y}$. The choice of the rth percentile is irrelevant to the number of outliers. When the outlier number is small, the difference between $q(\mathbf{y})$ and $\lambda$ is small. However, when the outlier number is large, the difference between $q(\mathbf{y})$ and $\lambda$ is large. The rth percentile also depends on the distribution of the data set. The COPA p values are calculated by permutation.
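
As an illustration of the statistic and its permutation p value, the following is a minimal sketch in Python with NumPy, not the original authors' implementation. The function names `copa_statistic` and `copa_permutation_pvalue`, the percentile value `r=90`, the number of permutations, and the one-sided "at least as large" p-value rule are all assumptions made for this example.

```python
import numpy as np


def copa_statistic(x, y, r=90.0):
    """COPA-style statistic for one gene.

    x: control expressions, y: case expressions.
    r: percentile of the case expressions (r=90 is an assumed value).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pooled = np.concatenate([x, y])
    lam = np.median(pooled)  # median of both x and y
    # Scaled median absolute deviation from the pooled median.
    sigma = 1.4826 * np.median(np.abs(pooled - lam))
    if sigma == 0.0:
        raise ValueError("median absolute deviation is zero for this gene")
    q = np.percentile(y, r)  # rth percentile of the case expressions
    return (q - lam) / sigma


def copa_permutation_pvalue(x, y, r=90.0, n_perm=1000, seed=0):
    """One-sided permutation p value (assumed scheme: shuffle group labels)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = copa_statistic(x, y, r)
    pooled = np.concatenate([x, y])
    n_x = x.size
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        # Recompute the statistic with randomly reassigned group labels.
        if copa_statistic(perm[:n_x], perm[n_x:], r) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    controls = [1, 2, 3, 4, 5]
    cases = [1, 2, 3, 4, 100]  # one high extreme outlier among the cases
    print(copa_statistic(controls, cases))
    print(copa_permutation_pvalue(controls, cases))
```

Because the pooled median and the median absolute deviation do not change when group labels are shuffled, only the percentile term varies across permutations in this sketch.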